PollCheckKnativeStatusFunc check if observedGeneration is less-than instead of not-equal #14087

modular-magician
Collaborator

When updating a Cloud Run service, PollCheckKnativeStatusFunc checks whether the status.observedGeneration field matches the metadata.generation field to determine that an update operation has finished. The value of metadata.generation is read once (from the response to the PUT operation that updates the service) and is not refreshed.

However, if a new revision is created during the polling period, PollCheckKnativeStatusFunc may never see the generation value that it is looking for.

We have observed this when Cloud Run services are managed by Terraform but also deployed "out-of-band" by other CI pipelines (e.g. deploying a new container from GitHub Actions).

  1. Terraform creates a new revision: observedGeneration is 1, generation is 2 (expected generation is 2).
  2. The operation completes; observedGeneration and generation are now 2.
  3. Another actor creates a new revision; generation is now 3 but observedGeneration is 2.
  4. The second operation completes; observedGeneration and generation are now 3.
  5. Terraform polls for the latest observedGeneration and sees that 3 != 2 (expected generation), so it continues polling until the timeout.
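
The sequence above can be replayed in a few lines of Go. This is only an illustrative sketch, not the provider's code: the `snapshot` type, the `pollNotEqual` function, and the hard-coded polls are hypothetical. They model a poller that captured the expected generation once and stops only on exact equality.

```go
package main

import "fmt"

// snapshot is a hypothetical stand-in for the two fields the poller reads.
type snapshot struct {
	Generation         int64 // metadata.generation
	ObservedGeneration int64 // status.observedGeneration
}

// pollNotEqual models the equality-based check: it stops only when the
// observed generation exactly equals the generation captured at update time.
// It returns the index of the poll that succeeded, or -1 if none did.
func pollNotEqual(expected int64, polls []snapshot) int {
	for i, s := range polls {
		if s.ObservedGeneration == expected {
			return i
		}
	}
	return -1 // stands in for the poll running into its timeout
}

func main() {
	const expected = 2 // read once from the response to the update

	// Every poll happens after the out-of-band deploy has completed (step 4),
	// so the service is already at generation 3.
	polls := []snapshot{
		{Generation: 3, ObservedGeneration: 3},
		{Generation: 3, ObservedGeneration: 3},
	}
	fmt.Println(pollNotEqual(expected, polls)) // prints -1
}
```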

Since the generation value is a number that increases predictably, it should be safe to keep polling only while observedGeneration is less than the expected generation, so that the poll ends once the expected generation has been reached or passed.
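
A minimal sketch of the proposed comparison, assuming both values are available as plain integers. The `stillPending` helper is a hypothetical name used for illustration, and the sketch covers only the generation comparison, not anything else the real poll function may check.

```go
package main

import "fmt"

// stillPending is a hypothetical helper: it reports whether polling should
// continue, given the latest observed generation and the expected generation
// captured from the update response.
func stillPending(observed, expected int64) bool {
	// Less-than instead of not-equal: an observed generation that has
	// already moved past the expected one also ends the poll.
	return observed < expected
}

func main() {
	const expected = 2
	for _, observed := range []int64{1, 2, 3} {
		fmt.Printf("observed=%d pending=%t\n", observed, stillPending(observed, expected))
	}
	// Output:
	// observed=1 pending=true
	// observed=2 pending=false
	// observed=3 pending=false
}
```

The comparison relies on the generation never decreasing; with a not-equal check, the `observed=3` case would be treated as still pending.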

If this PR is for Terraform, I acknowledge that I have:

  • Searched through the issue tracker for an open issue that this either resolves or contributes to, commented on it to claim it, and written "fixes {url}" or "part of {url}" in this PR description. If there were no relevant open issues, I opened one and commented that I would like to work on it (not necessary for very small changes).
  • Ensured that all new fields I added that can be set by a user appear in at least one example (for generated resources) or third_party test (for handwritten resources or update tests).
  • Generated Terraform providers, and ran make test and make lint in the generated providers to ensure it passes unit and linter tests.
  • Ran relevant acceptance tests using my own Google Cloud project and credentials (If the acceptance tests do not yet pass or you are unable to run them, please let your reviewer know).
  • Read the Release Notes Guide before writing my release note below.

Release Note Template for Downstream PRs (will be copied)

cloudrun: Fixed race condition when polling for status during an update of a `google_cloud_run_service`

Derived from GoogleCloudPlatform/magic-modules#7520

@modular-magician modular-magician merged commit bbc779d into hashicorp:main Mar 24, 2023
@github-actions

I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 24, 2023